This is a classification data set that comes with the NeurEco installation. 
The data is a random extraction from the RNA-Seq (HiSeq) PANCAN data set. It contains the gene expressions (:math:`20531` input features) of patients with different types of tumors (:math:`5` output features): BRCA, KIRC, COAD, LUAD and PRAD.
Each input is given a dummy name (gene_xx), while the targets are the cancer classes listed above.

The test case is provided with the following files:
  
* Training data set:

  * x_train_0.csv: the training inputs file - part 1, containing :math:`320` samples
  * y_train_0.csv: the training targets file - part 1

  * x_train_1.csv: the training inputs file - part 2, containing :math:`320` samples
  * y_train_1.csv: the training targets file - part 2

* Testing data set:

  * x_test.csv: the testing inputs file, containing :math:`161` samples
  * y_test.csv: the testing targets file
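
The training data is split over two pairs of files, so the parts have to be concatenated before use. The following is a minimal sketch of one way to do this with the Python standard library only. The helper names ``read_csv_rows`` and ``load_training_parts`` are hypothetical and are not part of NeurEco. The comma delimiter and the single header row are assumptions about the file layout and should be checked against the actual files.

.. code-block:: python

    import csv
    from pathlib import Path


    def read_csv_rows(path, delimiter=",", skip_header=True):
        """Read a delimited text file into a list of rows (lists of strings).

        Assumption: the first row holds column names (e.g. gene_xx) and is
        dropped when skip_header is True.
        """
        with open(path, newline="") as handle:
            rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
        return rows[1:] if skip_header else rows


    def load_training_parts(data_dir, n_parts=2, **read_options):
        """Concatenate x_train_<i>.csv and y_train_<i>.csv for i in range(n_parts)."""
        data_dir = Path(data_dir)
        inputs, targets = [], []
        for part in range(n_parts):
            x_rows = read_csv_rows(data_dir / f"x_train_{part}.csv", **read_options)
            y_rows = read_csv_rows(data_dir / f"y_train_{part}.csv", **read_options)
            # Each input sample must have exactly one matching target row.
            if len(x_rows) != len(y_rows):
                raise ValueError(
                    f"part {part}: {len(x_rows)} inputs vs {len(y_rows)} targets"
                )
            inputs.extend(x_rows)
            targets.extend(y_rows)
        return inputs, targets

If these assumptions hold for the files listed above, ``load_training_parts`` should return two lists of :math:`640` rows each (:math:`320` per part). The values are kept as strings; converting them to numbers is left to the caller.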

